GRAPH BOOTSTRAP PERCOLATION

JÓZSEF BALOGH, BÉLA BOLLOBÁS, AND ROBERT MORRIS

Abstract. Graph bootstrap percolation is a deterministic cellular automaton which was introduced by Bollobás in 1968, and is defined as follows. Given a graph $H$, and a set $G \subset E(K_n)$ of initially 'infected' edges, we infect, at each time step, a new edge $e$ if there is a copy of $H$ in $K_n$ such that $e$ is the only not-yet infected edge of $H$. We say that $G$ percolates in the $H$-bootstrap process if eventually every edge of $K_n$ is infected.

The extremal questions for this model, when $H$ is the complete graph $K_r$, were solved (independently) by Alon, Kalai and Frankl almost thirty years ago. In this paper we study the random questions, and determine the critical probability $p_c(n,K_r)$ for the $K_r$-process up to a poly-logarithmic factor. In the case $r = 4$ we prove a stronger result, and determine the threshold for $p_c(n,K_4)$.

1. Introduction 

Cellular automata, which were introduced by von Neumann (see [5U]) after a suggestion of Ulam [22], are dynamical systems (defined on a graph $G$) whose update rule is homogeneous and local. We shall study a particular cellular automaton, called $H$-bootstrap percolation, which was introduced over 40 years ago by Bollobás [13]. This model is a substantial generalization of $r$-neighbour bootstrap percolation (see below), an extensively studied model related to statistical physics. We shall determine the critical probability for $K_r$-percolation up to a poly-logarithmic factor for every $r \ge 4$ and moreover, using a completely different method, we shall determine the threshold for percolation in the case $r = 4$.

Given a graph $H$, we define $H$-bootstrap percolation (or $H$-edge-bootstrap percolation) as follows. Given a set $G \subset E(K_n)$ of initially 'infected' edges on vertex set $[n]$ (that is, given a graph), we set $G_0 = G$ and define, for each $t \ge 0$,

\[ G_{t+1} := G_t \cup \big\{ e \in E(K_n) : \exists\, H \text{ with } e \in H \subset G_t \cup \{e\} \big\}. \]

In words, this says that an edge $e$ becomes infected at time $t + 1$ if there exists a copy of $H$ in $K_n$ for which $e$ is the only uninfected edge at time $t$. Let $\langle G \rangle_H = \bigcup_t G_t$ denote the closure of $G$ under the $H$-bootstrap process, and say that $G$ percolates (or $H$-percolates) in $K_n$ if $\langle G \rangle_H = E(K_n)$.
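
To make the definition concrete, here is a minimal Python sketch (not part of the original argument) of the closure $\langle G \rangle_{K_r}$ in the special case $H = K_r$, on the vertex set $\{0, \ldots, n-1\}$. The names `kr_closure` and `percolates` are our own. Edges are infected one at a time rather than in synchronous rounds; since the process is monotone, this should produce the same closure.

```python
from itertools import combinations

def kr_closure(n, edges, r):
    """Closure of `edges` under the K_r-bootstrap process on range(n)."""
    infected = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        for clique in combinations(range(n), r):
            # Edges of this copy of K_r that are not yet infected.
            missing = [frozenset(p) for p in combinations(clique, 2)
                       if frozenset(p) not in infected]
            if len(missing) == 1:
                # Exactly one uninfected edge: it becomes infected.
                infected.add(missing[0])
                changed = True
    return infected

def percolates(n, edges, r):
    """True if the closure contains every edge of K_n."""
    return len(kr_closure(n, edges, r)) == n * (n - 1) // 2
```

The brute-force loop over all $r$-subsets is only practical for small $n$; it is meant as an executable restatement of the definition, not as an efficient algorithm.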

The $H$-bootstrap process was introduced over 40 years ago by Bollobás [13] (see also [H]), under the name 'weak saturation'. He conjectured that if a graph $G$ percolates in the $K_r$-process, then $G$ has at least $\binom{n}{2} - \binom{n-r+2}{2}$ edges, and, building on work in [12], proved his conjecture when $r \le 7$. For general $r$, the conjecture was proved using linear algebraic methods by Alon [1], Frankl [22] and Kalai [25]. See [H] for more recent extremal results, on a closely related process, using such methods.

In this paper, we shall study the $H$-bootstrap process in the random setting, i.e., when the initial graph $G$ is chosen to be $G_{n,p}$. Apart from its intrinsic interest, this question is motivated by the following, closely related cellular automaton, which was introduced in 1979 by Chalupa, Leath and Reich [17] in the context of disordered magnetic systems, and for which our process is named. Given an underlying graph $G$, an integer $r$ and a set of infected vertices $A \subset V(G)$, set $A_0 = A$ and let

\[ A_{t+1} := A_t \cup \big\{ v \in V(G) : |N(v) \cap A_t| \ge r \big\} \]

for each $t \ge 0$; that is, we infect a vertex if it has at least $r$ already-infected neighbours. Say that the set $A$ percolates if the entire vertex set is eventually infected. This process is known as $r$-neighbour bootstrap percolation, and has been extensively studied by mathematicians
(see, for example, [3l O [161 [2Sl ISHl El]), physicists (see [2], and the references therein) and 
sociologists [23], [33], amongst others. It has moreover found applications in the Glauber dynamics of the Ising model (see [21], [29]).
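
The update rule for $r$-neighbour bootstrap percolation can be sketched in a few lines of Python. This is an illustration only; the function names and the adjacency-map representation (a dictionary from each vertex to the set of its neighbours) are our own assumptions.

```python
def r_neighbour_closure(adj, infected, r):
    """Run r-neighbour bootstrap percolation until no new vertex is infected."""
    infected = set(infected)
    while True:
        # Vertices with at least r already-infected neighbours.
        newly = {v for v in adj
                 if v not in infected and len(adj[v] & infected) >= r}
        if not newly:
            return infected
        infected |= newly

def grid_graph(n):
    """Adjacency map of the n x n grid with nearest-neighbour edges."""
    adj = {}
    for x in range(n):
        for y in range(n):
            nbrs = {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}
            adj[(x, y)] = {(a, b) for a, b in nbrs if 0 <= a < n and 0 <= b < n}
    return adj
```

For instance, on the grid returned by `grid_graph(4)` with $r = 2$, infecting the diagonal $\{(i,i)\}$ is enough to infect every vertex, whereas a single infected vertex never spreads.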

The $r$-neighbour bootstrap model is usually studied in the random setting, where the main question is to determine the critical threshold at which percolation occurs. To be precise, if $V(G) = [n]$ and the elements of $A \subset V(G)$ are chosen independently at random, each with probability $p$, then one aims to determine the value $p_c$ of $p = p(n)$ at which percolation becomes likely. Sharp bounds on $p_c$ have recently been determined in several cases of particular interest, such as $[n]^d$ (see [5], [6], [7], [H], [21], [25]), on a large family of 'two-dimensional' graphs [18], on trees [10], [20], and on various types of random graph [HI], [27]. In each case, it was shown that the critical probability has a sharp threshold.

Motivated by these results, let us define the critical threshold for $H$-bootstrap percolation on $K_n$ as follows:

\[ p_c(n,H) := \inf\Big\{ p : \mathbb{P}\big( \langle G_{n,p} \rangle_H = K_n \big) \ge 1/2 \Big\}, \]

where $G_{n,p}$ is the Erdős-Rényi random graph, obtained by choosing each edge independently with probability $p$. (For background on the theory of Random Graphs, see [H].) Our aim is to determine $p_c(n,H)$ for every graph $H$. Here we shall study the case $H = K_r$, the complete graph; our main theorems partially solve Problem 1 of [15].

In order to aid the reader's intuition, let us first consider the case $H = K_3$, which is trivial. Indeed, it is easy to see that $G$ percolates in the $K_3$-process if and only if $G$ is connected. It is well-known (see [2]) that, with high probability, $G_{n,p}$ is connected if and only if it has no isolated vertex; thus, a straightforward calculation gives the following theorem of Erdős and Rényi [19], which was one of the first results on random graphs:

\[ p_c(n,K_3) = \frac{\log n}{n} + \Theta\Big( \frac{1}{n} \Big). \]

In fact Erdős and Rényi proved even more: that if $p = (\log n + c)/n$, then the probability that $G_{n,p}$ percolates in the $K_3$-process converges to $e^{-e^{-c}}$ as $n \to \infty$. We remark that the same result holds for the $C_k$-process, for any $k \ge 3$, see Section 5.
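
The equivalence between $K_3$-percolation and connectivity is easy to test numerically on small random graphs. The following Python sketch is purely illustrative (all function names are ours): it compares a brute-force $K_3$-closure with a depth-first connectivity check.

```python
import random
from itertools import combinations

def k3_percolates(n, edges):
    """Brute-force check that `edges` percolates in the K_3-process on range(n)."""
    infected = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        for tri in combinations(range(n), 3):
            missing = [frozenset(p) for p in combinations(tri, 2)
                       if frozenset(p) not in infected]
            if len(missing) == 1:
                infected.add(missing[0])
                changed = True
    return len(infected) == n * (n - 1) // 2

def is_connected(n, edges):
    """Depth-first search from vertex 0."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def agree_on_random_graphs(n, p, trials, seed=0):
    """True if the two tests agree on `trials` samples of G(n, p)."""
    rng = random.Random(seed)
    for _ in range(trials):
        edges = [e for e in combinations(range(n), 2) if rng.random() < p]
        if k3_percolates(n, edges) != is_connected(n, edges):
            return False
    return True
```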
For $r \ge 4$, the problem is more challenging, since there seems to be no simple description of the closed sets under the $K_r$-process. Set

\[ \lambda(r) := \frac{\binom{r}{2} - 2}{r - 2}. \]

The following theorem is our main result.

Theorem 1. For every $r \ge 4$, there exists a constant $c = c(r) > 0$ such that

\[ \frac{n^{-1/\lambda(r)}}{c \log n} \le p_c(n,K_r) \le n^{-1/\lambda(r)} \log n \]

for every sufficiently large $n \in \mathbb{N}$.

In fact we shall prove slightly stronger bounds (see Propositions 3 and 8); however, we do not expect either of our bounds to be sharp. The proof of the lower bound in Theorem 1 is based on an extremal result on graphs which cause a given edge to be infected (see Lemma 9). Although it is not long, the proof of this lemma is delicate, and does not seem to extend easily to other graphs. The upper bound, on the other hand, holds for a much wider family of graphs $H$ (see Section 2), which we call 'balanced'.

In the case $r = 4$ we shall prove the following stronger result, which determines $p_c(n, K_4)$ up to a constant factor.

Theorem 2. If $n$ is sufficiently large, then

\[ \frac{1}{4\sqrt{n \log n}} \le p_c(n,K_4) \le \frac{5}{\sqrt{n \log n}}. \]

The proof of Theorem 2 is completely different from that of Theorem 1, and is based on ideas from two-neighbour bootstrap percolation on $[n]^2$. Although we have not done so, it seems plausible to us that one could prove a sharp threshold in the case $H = K_4$.

The rest of the paper is organized as follows. In Sections 2 and 3 we shall prove the upper and lower bounds in Theorem 1, respectively. In Section 4 we shall prove Theorem 2, and in Section 5 we shall discuss other graphs $H$, and state some open problems.

2. An upper bound for balanced graphs

In this section we shall prove the upper bound in Theorem 1; in fact we prove a stronger bound for a more general family of graphs, $H$. Throughout, we shall assume that $v(H) \ge 4$, since otherwise the problem is trivial. We make the following definition.

Definition 1 (Balanced graphs). We call a graph $H$ balanced if $e(H) \ge 2v(H) - 2$, and

\[ \frac{e(F) - 1}{v(F) - 2} \le \lambda(H) := \frac{e(H) - 2}{v(H) - 2} \]

for every proper subgraph $F \subset H$ with $v(F) \ge 3$.

It is straightforward to check that the complete graph $K_r$ is balanced for every $r \ge 4$. Thus, the upper bound in Theorem 1 follows immediately from the following proposition.
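
The claim that $K_r$ is balanced can be checked mechanically for small $r$. The sketch below assumes the reading of Definition 1 in which the condition is $(e(F)-1)/(v(F)-2) \le (e(H)-2)/(v(H)-2)$ for proper subgraphs $F$; the function names are ours. Since the left-hand side increases with $e(F)$, it suffices to test the densest proper subgraph on each number $s$ of vertices, namely $K_s$ for $s < r$ and $K_r$ minus an edge for $s = r$.

```python
from fractions import Fraction
from math import comb

def lam(r):
    """lambda(K_r) = (C(r,2) - 2) / (r - 2), as an exact fraction."""
    return Fraction(comb(r, 2) - 2, r - 2)

def complete_graph_is_balanced(r):
    """Check both conditions of the (assumed) definition of 'balanced' for K_r."""
    if comb(r, 2) < 2 * r - 2:
        return False
    for s in range(3, r + 1):
        # Densest proper subgraph of K_r on s vertices.
        max_edges = comb(s, 2) if s < r else comb(r, 2) - 1
        if Fraction(max_edges - 1, s - 2) > lam(r):
            return False
    return True
```

Note that $K_3$ fails the first condition, which is consistent with the standing assumption $v(H) \ge 4$.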
Proposition 3. If $H$ is a balanced graph, then

\[ p_c(n,H) \le C \Big( \frac{\log n}{\log\log n} \Big)^{2/\lambda(H)} n^{-1/\lambda(H)} \]

for some constant $C > 0$.

Note that $\lambda(K_r) = \lambda(r)$ satisfies

\[ \frac{r}{2} \le \lambda(r) \le \frac{r+1}{2} \]

if $r \ge 4$ (we shall use these bounds several times during the proof), so Proposition 3 actually implies the following slightly stronger upper bound than that stated in Theorem 1:

\[ p_c(n,K_r) \le n^{-1/\lambda(r)} (\log n)^{4/r}. \]
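
For the reader's convenience, here is a short verification of these bounds, written out under the assumption that $\lambda(r) = \big(\binom{r}{2} - 2\big)/(r-2) = (r^2 - r - 4)/(2(r-2))$:

\[
\lambda(r) - \frac{r}{2} = \frac{r^2 - r - 4 - r(r-2)}{2(r-2)} = \frac{r - 4}{2(r-2)} \ge 0,
\qquad
\frac{r+1}{2} - \lambda(r) = \frac{(r+1)(r-2) - (r^2 - r - 4)}{2(r-2)} = \frac{1}{r-2} > 0,
\]

for every $r \ge 4$. In particular $2/\lambda(r) \le 4/r$, which is how the exponent $4/r$ arises from the exponent $2/\lambda(H)$ in Proposition 3.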



We begin by sketching the proof of Proposition 3. We shall define for each $d \in \mathbb{N}$ a rooted graph, i.e., a pair $(H_d, e)$ where $H_d$ is a graph and $e \in \binom{V(H_d)}{2}$ is its 'root', with the following properties:

(a) $v(H_d) = (v(H) - 2)d + 2$    (b) $e(H_d) = (e(H) - 2)d + 1$    (c) $e \in \langle H_d \rangle_H$.

That is, if $H_d$ occurs in $G_{n,p}$, then its root $e$ is infected in the $H$-bootstrap process.

To define $H_d$, choose a sequence of edges $(e_1, e_2, \ldots)$ of $H$, with $e_1 = e$, such that for every $j \in \mathbb{N}$, $e_j$ and $e_{j+1}$ do not share an endpoint. Let $(V_1, V_2, \ldots)$ be a sequence of vertex sets with $|V_j| = |V(H)|$ for each $j \in \mathbb{N}$, such that $|V_i \cap V_j| = 2$ if $|i - j| = 1$, and $|V_i \cap V_j| = 0$ otherwise. We remark that although the definition of $H_d$ will depend on the choice of $(e_1, e_2, \ldots)$, the proof below will work for any such sequence.

Definition 2 (The graph $H_d$). For each $d \in \mathbb{N}$, let $(H_d, e)$ denote the rooted graph with root $e = e_1$, vertex set $V_1 \cup \ldots \cup V_d$, and edge set $E(H_d) = E(H_d[V_1]) \cup \ldots \cup E(H_d[V_d])$, such that

\[ H_d[V_j] = H - \{e_j, e_{j+1}\} \]

and $V_j \cap V_{j+1} = e_{j+1}$ for every $1 \le j \le d - 1$, and $H_d[V_d] = H - \{e_d\}$.

In other words, we place a copy of $H$ on each $V_j$, in such a way that $V_j \cap V_{j+1} = e_{j+1}$ for each $j \in [d - 1]$, and then remove the edges $e_1, \ldots, e_d$.

Observation 4. Properties (a), (b) and (c) hold for $H_d$.
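
As an illustration of Definition 2 and Observation 4, the following Python sketch builds $H_d$ for $H = K_4$ and checks properties (a), (b) and (c) by brute force. The labelling is our own choice: $V_j = \{2j-2, 2j-1, 2j, 2j+1\}$ and $e_j = \{2j-2, 2j-1\}$, so that consecutive blocks share exactly the pair $e_{j+1}$ and the root is $\{0, 1\}$.

```python
from itertools import combinations

def build_hd_k4(d):
    """Edge set of H_d for H = K_4 under the labelling described above."""
    edges = set()
    for j in range(1, d + 1):
        block = range(2 * j - 2, 2 * j + 2)  # the four vertices of V_j
        edges |= {frozenset(p) for p in combinations(block, 2)}
    # Remove e_1, ..., e_d.
    removed = {frozenset((2 * j - 2, 2 * j - 1)) for j in range(1, d + 1)}
    return edges - removed

def k4_closure(n, edges):
    """Closure under the K_4-bootstrap process on range(n)."""
    infected = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        for quad in combinations(range(n), 4):
            missing = [frozenset(p) for p in combinations(quad, 2)
                       if frozenset(p) not in infected]
            if len(missing) == 1:
                infected.add(missing[0])
                changed = True
    return infected

def check_properties(d):
    """Return the truth values of properties (a), (b), (c) for this H_d."""
    edges = build_hd_k4(d)
    n = len(set().union(*edges))
    root = frozenset((0, 1))
    return (n == 2 * d + 2, len(edges) == 4 * d + 1, root in k4_closure(n, edges))
```

In this example the infection travels backwards along the chain: the copy on $V_d$ is missing only $e_d$, and once $e_d$ is infected the copy on $V_{d-1}$ is missing only $e_{d-1}$, and so on down to the root.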

Let $X_d(e)$ be the random variable which counts the number of copies of $H_d$ in $G_{n,p}$, rooted at a given edge $e$. It is straightforward (using property (a)) to show that the expected value of $X_d$ is large if $p^{\lambda(H)} n \gg (\log n)^2$ (see Lemma 5); the main challenge will be to bound the variance of $X_d$. The key step is therefore Lemma 6 below, which controls the number of edges in the intersection of two copies of $H_d$ with the same root. Having bounded the variance of $X_d$, the proposition follows easily by Chebychev's inequality.

We begin by bounding the expected value of $X_d(e)$.
Lemma 5. Let $H$ be a balanced graph, and $e \in E(K_n)$. If $p = p(n)$ and $d = d(n)$ are chosen so that $p^{\lambda(H)} n \ge \omega\, v(H) d$ and $\omega^{(v(H)-2)d} \ge n$ for some function $\omega = \omega(n)$, and $pn \to \infty$, then

\[ \mathbb{E}\big( X_d(e) \big) \to \infty \]

as $n \to \infty$.

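Proof. Recall that $H_d$ has $(v(H) - 2)d + 2$ vertices and $(e(H) - 2)d + 1$ edges. Thus

\[ \mathbb{E}\big( X_d(e) \big) \ge \binom{n-2}{(v(H)-2)d} p^{(e(H)-2)d+1} \ge \Big( \frac{n}{v(H) d} \Big)^{(v(H)-2)d} p^{(e(H)-2)d+1}. \]
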
Since $e(H) - 2 = \lambda(H)(v(H) - 2)$, and using our bounds on $\omega$, it follows that

\[ \mathbb{E}\big( X_d(e) \big) \ge p \Big( \frac{p^{\lambda(H)} n}{v(H) d} \Big)^{(v(H)-2)d} \ge p \cdot \omega^{(v(H)-2)d} \ge pn \to \infty, \]

as required. □

Let $\mathrm{root}(H_d)$ denote the root of $H_d$, and recall that $e(H_d) = (v(H_d) - 2)\lambda(H) + 1$. We shall next prove that the variance of $X_d(e)$ is small; the following lemma is the key step.

Lemma 6. Let $H$ be a balanced graph, and let $d \in \mathbb{N}$. If $F \subsetneq H_d$ and $\mathrm{root}(H_d) \subset V(F)$, then

\[ e(F) \le (v(F) - 2)\lambda(H). \]

Proof. We shall use induction on $d$. The case $d = 1$ is trivial, since $e = \mathrm{root}(H_d) \subset V(F)$, so either $v(F) = 2$ and $e(F) = 0$, or $v(F) \ge 3$ and $F + e \subset H$, and so the bound follows by Definition 1. Let $d \ge 2$, and assume that the result holds for every $d' < d$. For each $j \in [d]$, let $V(F_j) = V(F) \cap V_j$ and let $F_j = H[V(F_j)]$ be the subgraph of $H$ induced by these vertices.

Suppose first that $v(F_j) \le 1$ for some $j \in [d]$, and let $F'$ and $F''$ be the subgraphs of $H_d$ induced by $V(F) \cap (V_1 \cup \ldots \cup V_{j-1})$ and $V(F) \cap (V_{j+1} \cup \ldots \cup V_d)$, respectively. Applying the induction hypothesis to $F'$, we see that

\[ e(F) = e(F') + e(F'') \le (v(F') - 2)\lambda(H) + e(F''), \]

so it will suffice to prove that $e(F'') \le v(F'')\lambda(H)$. Now, applying the induction hypothesis to $F^*$, the subgraph of $H_d$ induced by $V(F'') \cup (V_j \cap V_{j+1})$, we get either

\[ e(F^*) \le (v(F^*) - 2)\lambda(H) \le v(F'')\lambda(H) \]

as required, or $V(F^*) = V_{j+1} \cup \ldots \cup V_d$. But in the latter case $e(F^*) - e(F'') \ge 1$ (since $v(F_j) \le 1$ and $H$ is balanced), so

\[ e(F'') \le e(F^*) - 1 = (v(F^*) - 2)\lambda(H) \le v(F'')\lambda(H), \]

as required. Hence we may assume that $v(F_j) \ge 2$ for every $j \in [d]$. Moreover, a similar easy calculation proves the lemma if $V(F) = V(H_d)$, so we may assume that $v(F) < v(H_d)$.
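Now, for each $j \in [d - 1]$, let $E_j$ denote the event that $V_j \cap V_{j+1} \subset V(F)$, and let $\mathbb{1}[\cdot]$ denote the indicator function. Then, recalling that $F_j = H[V(F_j)]$, we have

\[ e(F) \le \Big( \sum_{j=1}^{d} e(F_j) \Big) - 1 - 2 \sum_{j=1}^{d-1} \mathbb{1}[E_j], \]

by the definition of $H_d$, and since $\mathrm{root}(H_d) \subset V(F)$. We next claim that, since $H$ is balanced and $F \ne H_d$, it follows that

\[ \sum_{j=1}^{d} e(F_j) \le \Big( \sum_{j=1}^{d} \big( v(F_j) - 2 \big) \Big) \lambda(H) + 2d - 1. \]

To see this, observe that $e(F_j) \le (v(F_j) - 2)\lambda(H) + 1$ holds for every $F_j \subsetneq H$ with $v(F_j) \ge 2$, by Definition 1 (whereas $e(H) = (v(H) - 2)\lambda(H) + 2$), and that $F_j \ne H$ for some $j \in [d]$, since $v(F) < v(H_d)$.
Finally, observe that

\[ \sum_{j=1}^{d} \big( v(F_j) - 2 \big) \le v(F) - d - 1 + \sum_{j=1}^{d-1} \mathbb{1}[E_j], \]

and so

\[ e(F) \le \Big( v(F) - d - 1 + \sum_{j=1}^{d-1} \mathbb{1}[E_j] \Big) \lambda(H) + (2d - 2) - 2 \sum_{j=1}^{d-1} \mathbb{1}[E_j]. \]

But

\[ \Big( d - 1 - \sum_{j=1}^{d-1} \mathbb{1}[E_j] \Big) \lambda(H) \ge 2d - 2 - 2 \sum_{j=1}^{d-1} \mathbb{1}[E_j], \]

since $\lambda(H) \ge 2$, by Definition 1. Hence

\[ e(F) \le (v(F) - 2)\lambda(H) \]

for every $F \subsetneq H_d$, as required. □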

It is now straightforward to deduce the required bound on the variance of $X_d(e)$.

Lemma 7. Let $H$ be a balanced graph, and $e \in E(K_n)$. If $p = p(n)$ and $d = d(n)$ are chosen so that $v(H_d)^{-2} p^{\lambda(H)} n \to \infty$ as $n \to \infty$, then

\[ \frac{\mathrm{Var}\big( X_d(e) \big)}{\mathbb{E}\big( X_d(e) \big)^2} \to 0 \]

as $n \to \infty$.
Proof. Let $\iota(H_d)$ denote the number of copies of $H_d$, rooted at $e$, which share the same vertex set. Then

\[ \mathbb{E}\big( X_d(e) \big) = \iota(H_d) \binom{n-2}{v(H_d) - 2} p^{e(H_d)}. \]

Moreover, we claim that

\[ \mathrm{Var}\big( X_d(e) \big) \le \mathbb{E}\big( X_d(e) \big) + \iota(H_d)^2 \sum_{m=1}^{v(H_d)-2} \binom{n}{m} \binom{n}{v(H_d) - m - 2}^2 p^{2e(H_d) - \lambda(H) m}. \tag{1} \]

To see this, we simply count (ordered) pairs $(A, B)$, where $A$ and $B$ are copies of $H_d$ in $G_{n,p}$ with root $e$. Let $F = A \cap B$ and $m = |V(A) \cap V(B)| - 2$, and note that we expect at most $\mathbb{E}(X_d(e))$ such pairs $(A, B)$ with $m = 0$.

By Lemma 6, if $A \ne B$ then $e(F) \le \lambda(H) m$, and so $e(A \cup B) \ge 2e(H_d) - \lambda(H) m$. Moreover, given $m$, there are at most

\[ \iota(H_d)^2 \binom{n}{m} \binom{n}{v(H_d) - m - 2}^2 \]

choices for $A$ and $B$. This proves (1).

Combining the bounds above, and setting $k = v(H_d)$, we obtain

as $n \to \infty$, as required. □

We can now deduce Proposition 3 using Chebychev's inequality and sprinkling.

Proof of Proposition 3. Let $H$ be a balanced graph, suppose that $p \gg \big( \frac{\log n}{\log\log n} \big)^{2/\lambda(H)} n^{-1/\lambda(H)}$, and let

\[ d(n) = \Big\lfloor \frac{\log n}{\log\log n} \Big\rfloor. \]

We claim that $p(n)$ and $d(n)$ satisfy the conditions of Lemmas 5 and 7. Indeed, setting $\omega(n) = d(n)$ we have $p^{\lambda(H)} n \gg \omega d$ and $\omega^{2d} \ge n$, so Lemma 5 holds, and $p^{\lambda(H)} n \gg d^2$, so Lemma 7 holds. Thus, by Chebychev's inequality,

\[ \mathbb{P}\big( X_d(e) = 0 \big) \le \frac{\mathrm{Var}\big( X_d(e) \big)}{\mathbb{E}\big( X_d(e) \big)^2} \to 0 \]

as $n \to \infty$. Moreover, if $X_d(e) \ne 0$ then $e \in \langle G_{n,p} \rangle_H$, since if $e$ is the root of some copy of $H_d$ then it is infected after at most $d$ steps of the $H$-process. Hence, by Markov's inequality, if $p^{\lambda(H)} n \gg \big( \frac{\log n}{\log\log n} \big)^2$ then, with high probability, all but $o(n^2)$ edges of $K_n$ are infected in the $H$-process on $G_{n,p}$.

To finish the proof, we shall show that by sprinkling $O(n \log n)$ extra edges, we shall infect all of the remaining edges, with high probability. We use the following easy claim.

Claim: If $e\big( \langle G_{n,p} \rangle_H \big) \ge \binom{n}{2} - o(n^2)$, then there is a clique of size $n - o(n)$ in $G = \langle G_{n,p} \rangle_H$.

Proof of Claim. Let $0 < c < 1$ be arbitrary. By the pigeonhole principle, there are $o(n)$ pairs $\{x,y\}$ such that $d_G(x) + d_G(y) < (2 - c)n$. We claim that every other edge is in $G$.

Indeed, if $d_G(x) + d_G(y) \ge (2 - c)n$ then, by Turán's Theorem, there is a $(v(H) - 2)$-clique in $N_G(x) \cap N_G(y)$, since $o(n^2)$ edges are missing. But then $xy \in \langle G \rangle_H$, which implies that $xy \in G$, since $G$ is closed under the $H$-bootstrap process. Thus $e(G) \ge \binom{n}{2} - o(n)$, and so the claim follows. □

Finally, let us sprinkle edges with density $p$; that is, let us take a second copy of $G_{n,p}$ and consider the union of the two random graphs. We obtain a random graph $G_{n,p^*}$ of density $p^* = 1 - (1 - p)^2 < 2p$. Let $K$ be the clique found in the claim, and observe that if every vertex outside $K$ has at least $v(H) - 1$ neighbours in $K$ (in the second copy of $G_{n,p}$) then $G_{n,p^*}$ will percolate. Since $pn \gg \log n$, this occurs with high probability, and hence

\[ p_c(n,H) \le C \Big( \frac{\log n}{\log\log n} \Big)^{2/\lambda(H)} n^{-1/\lambda(H)} \]

if $C$ is sufficiently large, as required. □

3. Lower bound for $K_r$-percolation

In this section we shall prove the following proposition, which shows that, if $r \ge 4$ and $(p \log n)^{\lambda(r)} n \ll 1$, then with high probability $o(n^2)$ edges are infected in the $K_r$-bootstrap process with initial set $G_{n,p}$.

Proposition 8. Let $r \ge 4$, and let $e \in E(K_n)$. If $p n^{1/\lambda(r)} \log n \le 1/(2e)$, then

\[ \mathbb{P}\big( e \in \langle G_{n,p} \rangle_{K_r} \big) \to 0 \]

as $n \to \infty$.

The idea of the proof is as follows. If $e \in \langle G \rangle_{K_r}$ for some graph $G$, then there must exist a 'witness set' of edges of $G$ which caused $e$ to be infected. We shall describe an algorithm which finds such a set $F = F(e)$ of edges, and show that this set has two useful properties:

(a) $e(F) \ge \lambda(r)(v(F) - 2) + 1$ (see Lemma 9).

(b) If $e(F) \ge \binom{r}{2} L$, then $L \le e(F(f)) \le \binom{r}{2} L$ for some $f \in \langle G \rangle_{K_r}$ (see Lemma 13).

Property (a) will allow us to bound the expected number of such sets when $G = G_{n,p}$ and $e(F) = O(\log n)$; combining it with property (b) will allow us to do so when $e(F)$ is larger than this.

3.1. Extremal results. Let $r \ge 4$ be fixed for the remainder of this section, and let $G$ be an arbitrary graph. We begin by describing the algorithm which finds $F(e)$.

The Witness-Set Algorithm. We assign a graph $F = F(e) \subset G$ to each edge $e \in \langle G \rangle_{K_r}$ as follows:

1. If $e \in G$ then set $F(e) = \{e\}$.

2. Choose an order in which to infect the edges of $\langle G \rangle_{K_r}$, and at each step identify which $r$-clique was completed (if more than one is completed then choose one).

3. Infect the edges one by one. If $e$ is infected by the $r$-clique $K$, then set

\[ F(e) := \bigcup_{e \ne e' \in K} F(e'). \]

We call the graph $F(e)$ a witness set for the event $e \in \langle G \rangle_{K_r}$.

Since every $e \ne e' \in K$ is either in $G$, or was infected earlier in the process, the algorithm is well-defined. Note that the graphs $F(e)$ depend on the order in which we chose to infect the edges (that is, they depend on Step 2 of the algorithm); the results below hold for every possible such choice.
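
The Witness-Set Algorithm is straightforward to simulate on small graphs. The Python sketch below is one possible realization (the infection order is simply the order in which a brute-force scan over $r$-subsets finds a completable clique, and all names are ours). It also includes a helper that tests the inequality $e(F) \ge \lambda(r)(v(F)-2)+1$ from property (a), with $\lambda(r) = \big(\binom{r}{2}-2\big)/(r-2)$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def witness_sets(n, edges, r):
    """Map each edge of the K_r-closure to a witness set F(e) (a set of edges of G)."""
    # Step 1: every initial edge is its own witness set.
    witness = {frozenset(e): {frozenset(e)} for e in edges}
    progress = True
    while progress:
        progress = False
        for clique in combinations(range(n), r):
            pairs = [frozenset(p) for p in combinations(clique, 2)]
            missing = [p for p in pairs if p not in witness]
            if len(missing) == 1:
                # Step 3: e is infected by this clique; take the union of the
                # witness sets of the other edges of the clique.
                e = missing[0]
                witness[e] = set().union(*(witness[q] for q in pairs if q != e))
                progress = True
    return witness

def satisfies_property_a(F, r):
    """Check e(F) >= lambda(r) * (v(F) - 2) + 1 for an edge set F."""
    lam = Fraction(comb(r, 2) - 2, r - 2)
    vertices = set().union(*F)
    return len(F) >= lam * (len(vertices) - 2) + 1
```

For example, taking $r = 4$ and $G$ equal to $K_4$ minus the edge $\{0,1\}$, the witness set of $\{0,1\}$ is all five edges of $G$, and the inequality holds with equality, since $\lambda(4) = 2$.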

We shall say that a graph $F$ is an $r$-witness set if there exists a graph $G$, an edge $e$, and a realization of the Witness-Set Algorithm (i.e., a choice as in Step 2) such that $F = F(e)$. The key lemma in the proof of Proposition 8 is the following extremal result.

Lemma 9. Let $F$ be a graph and $r \ge 4$, and suppose that $F$ is an $r$-witness set. Then

\[ e(F) \ge \lambda(r)(v(F) - 2) + 1. \]

We shall prove Lemma 9 using induction; in order to do so, we shall need to state a more general version of it (see Lemma 10 below). The statement is slightly technical, and we shall need some preparatory definitions. We shall use the following algorithm, which is simply a restatement of the Witness-Set Algorithm.

The Red Edge Algorithm. Let $G$ be a graph, let $r \ge 4$, and let $e \in \langle G \rangle_{K_r} \setminus G$.

1. Run the Witness-Set Algorithm until edge $e$ is infected.

2. Let $(e_1, e_2, \ldots, e_m)$ be the edges infected in the process with $F(e_j) \subset F(e)$ and $e_j \notin G$, written in the order in which they are infected, where $e_m = e$.

3. For each $1 \le j \le m$, let $K^{(j)}$ be the $r$-clique which is completed by $e_j$.

4. Colour the edges $\{e_1, \ldots, e_m\}$ red, and note that $e_j \in K^{(j)} \setminus (K^{(1)} \cup \ldots \cup K^{(j-1)})$.

The key observation is that $F(e) = (K^{(1)} \cup \ldots \cup K^{(m)}) \setminus \{e_1, \ldots, e_m\}$, or, in words, $F(e)$ consists of all the non-red edges of the cliques. We shall bound the number of non-red edges after $t$ steps of the Red Edge Algorithm. Thus, given a realization of the algorithm and $t \in [m]$, define

\[ B_t := (K^{(1)} \cup \ldots \cup K^{(t)}) \setminus \{e_1, \ldots, e_t\}. \]

We shall only use the following properties of the Red Edge Algorithm: at step $j$ an $r$-clique is added, one of the edges $e_j$ of $K^{(j)}$ is coloured red, and $e_j \notin K^{(i)}$ for every $i < j$.

In order to state Lemma 10, we need to define two more parameters of the model, which will both play a key role in the induction step.

Definition 3 ($\ell$ and $k$). Let $Q_t$ denote the graph, obtained using the Red Edge Algorithm, whose vertices are the cliques $\{K^{(1)}, \ldots, K^{(t)}\}$, and in which two cliques are adjacent if they share at least two vertices.

Let $\ell = \ell_t$ denote the number of components of $Q_t$, let $c(v) = c_t(v)$ denote the number of components of $Q_t$ containing the vertex $v \in V(G)$, and set

\[ k = k_t := \sum_{v \in V(B_t)} \big( c_t(v) - 1 \big). \]

Here, and throughout, we treat $C_i$ as a component in $Q_t$, and also as a subset of $V(G)$, and trust that this will not cause confusion.

The following lemma easily implies Lemma 9, since when $t = m$ we have $\ell = 1$ and hence $k = 0$.



Lemma 10.

\[ e(B_t) \ge \frac{\binom{r}{2} - 2}{r - 2} \big( v(B_t) + k - \ell r \big) + \ell \Big( \binom{r}{2} - 1 \Big). \]

We shall prove Lemma 10 by induction on $t$. The induction step will be relatively straightforward when $\ell_t \ge \ell_{t-1}$; when $\ell_t < \ell_{t-1}$ we shall need the following lemma.

Say that a (multi-)family of sets $\mathcal{A}$ is a double cover of $X$ if every element of $X$ is in at least two members of $\mathcal{A}$.

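Lemma 11. Let $m \ge 2$ and $r \ge 4$, and let $\mathcal{A}$ be a multi-family of subsets of $[m]$. If $\mathcal{A}$ is a double cover of $[m]$, and $|\mathcal{A}| \le r$, then

\[ \Big| \Big\{ \{A,B\} \in \binom{\mathcal{A}}{2} : A \cap B \ne \emptyset \Big\} \Big| \le \lambda(r) \Big( \sum_{A \in \mathcal{A}} |A| - 2m \Big) + m. \tag{2} \]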

Proof of Lemma 11. We shall use induction on $m$. Suppose first that $m = 2$, and let $\mathcal{A}$ consist of $x$ sets of size two and $y$ sets of size one. If $x \ge 2$, then we have

\[ \binom{x}{2} + xy + \binom{y}{2} = \binom{x+y}{2} = \lambda(x+y)(x + y - 2) + 2 \le \lambda(r)(2x + y - 4) + 2, \]

since $2x + y \ge 4$ and $x + y \le r$. Similarly, if $x = 1$ then $y + \binom{y-1}{2} \le \lambda(r)(y - 2) + 2$ for every $3 \le y \le r - 1$, and if $x = 0$, then $1 + \binom{y-2}{2} \le \lambda(r)(y - 4) + 2$ for every $4 \le y \le r$.

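So let $m \ge 3$, and let $\mathcal{A}$ be a multi-family as described, let $T = \{A \in \mathcal{A} : m \in A\}$, and apply the induction hypothesis to the multi-family $\mathcal{A}'$ obtained by removing $m$ from each element of $T$. Letting $t = |T|$, assume first that $t < r$. This gives

\[
\begin{aligned}
\Big| \Big\{ \{A,B\} \in \binom{\mathcal{A}}{2} : A \cap B \ne \emptyset \Big\} \Big|
&\le \Big| \Big\{ \{A,B\} \in \binom{\mathcal{A}'}{2} : A \cap B \ne \emptyset \Big\} \Big| + \binom{t}{2} \\
&\le \lambda(r) \Big( \sum_{A \in \mathcal{A}'} |A| - 2(m-1) \Big) + (m - 1) + \binom{t}{2} \\
&= \lambda(r) \Big( \sum_{A \in \mathcal{A}} |A| - 2m \Big) + m + \binom{t}{2} - 1 - \lambda(r)(t - 2),
\end{aligned}
\]

so it will suffice to show that $\lambda(r)(t - 2) \ge \binom{t}{2} - 1$. But $\frac{\binom{t}{2} - 1}{t - 2} = \frac{t+1}{2} \le \frac{r}{2}$, and $\lambda(r) \ge \frac{r}{2}$ if $r \ge 4$. Hence we are done unless $t = r$.

Finally, suppose that $t = r$. Then the left-hand side of (2) is equal to $\binom{r}{2}$, and the right-hand side is at least

\[ \lambda(r)\big( r + 2(m - 1) - 2m \big) + m = \binom{r}{2} - 2 + m \ge \binom{r}{2}, \]

since $\mathcal{A}$ is a double cover of $[m]$ and $m \ge 2$. The induction step, and hence the lemma, follows. □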

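In fact, the following reformulation of Lemma 11 will be more convenient for us in the proof below. Here $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$, and $\mathcal{P}(m)$ denotes the non-empty subsets of $[m]$.

Lemma 12. Let $m \ge 2$ and $r \ge 4$. Given any function $a : \mathcal{P}(m) \to \mathbb{N}_0$ such that $\sum_S a_S \le r$ and $\sum_{S \ni j} a_S \ge 2$ for every $j \in [m]$, we have

\[ \sum_{S \in \mathcal{P}(m)} \binom{a_S}{2} + \sum_{\{S,T\} \in J} a_S a_T \le \lambda(r) \Big( \sum_{S \in \mathcal{P}(m)} a_S |S| - 2m \Big) + m, \tag{3} \]

where $J = \big\{ \{S,T\} \in \binom{\mathcal{P}(m)}{2} : S \cap T \ne \emptyset \big\}$.

Proof. We apply Lemma 11 to the multi-family $\mathcal{A}$ which contains exactly $a_S$ copies of $S$ for each $S \subset [m]$. The condition $\sum_{S \ni j} a_S \ge 2$ implies that $\mathcal{A}$ is a double cover, and $\sum_S a_S \le r$ implies that $|\mathcal{A}| \le r$. Thus (2) holds, which is clearly equivalent to (3). □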
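
As a sanity check, the inequality of Lemma 11 can be tested exhaustively for small parameters. The Python sketch below assumes that the lemma reads "the number of intersecting pairs in $\mathcal{A}$ is at most $\lambda(r)\big(\sum_{A} |A| - 2m\big) + m$", with $\lambda(r) = \big(\binom{r}{2}-2\big)/(r-2)$; the function name is ours. It enumerates every multi-family of at most $r$ non-empty subsets of $[m]$ that is a double cover.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement
from math import comb

def lemma11_holds(r, m):
    """Exhaustively test the (assumed) inequality of Lemma 11 for given r and m."""
    lam = Fraction(comb(r, 2) - 2, r - 2)
    ground = range(1, m + 1)
    subsets = [frozenset(s) for k in range(1, m + 1)
               for s in combinations(ground, k)]
    for size in range(2, r + 1):
        # Multi-families are multisets of subsets, so repeats are allowed.
        for family in combinations_with_replacement(subsets, size):
            if any(sum(j in A for A in family) < 2 for j in ground):
                continue  # not a double cover of [m]
            intersecting = sum(1 for A, B in combinations(family, 2) if A & B)
            bound = lam * (sum(len(A) for A in family) - 2 * m) + m
            if intersecting > bound:
                return False
    return True
```

Equality does occur: for $r = 4$ and $m = 3$, the family $\{1,2\}, \{1,3\}, \{2,3\}$ has three intersecting pairs, and the right-hand side is $\lambda(4) \cdot 0 + 3 = 3$.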

We can now deduce Lemma 10.

Proof of Lemma 10. We shall prove the lemma by induction on $t$. When $t = 1$ we have $v(B_1) = r$ and $e(B_1) = \binom{r}{2} - 1$. Clearly $\ell_1 = 1$ and $k_1 = 0$, and

\[ e(B_1) = \binom{r}{2} - 1 = \lambda(r)\big( v(B_1) - r \big) + \binom{r}{2} - 1, \]

so in fact equality holds. For the induction step we divide into three cases. Let $t \ge 2$, and assume that the lemma holds for smaller values of $t$.

Case 1: $\ell_t = \ell_{t-1} + 1$.

Since $Q_t$ has one more component than $Q_{t-1}$, it follows that $K^{(t)}$ intersects every other clique in at most one vertex. Hence all of the edges of $K^{(t)}$ are new, and so

\[ e(B_t) = e(B_{t-1}) + \binom{r}{2} - 1. \]

Now let $b$ be the number of vertices of $K^{(t)}$ which are not new, and hence intersect other components of $Q_t$. Then $v(B_t) = v(B_{t-1}) + r - b$ and $k_t = k_{t-1} + b$, so, by the induction hypothesis for $t - 1$,

\[
\begin{aligned}
e(B_t) &\ge \lambda(r)\big( v(B_{t-1}) + k_{t-1} - \ell_{t-1} r \big) + (\ell_{t-1} + 1)\Big( \binom{r}{2} - 1 \Big) \\
&= \lambda(r)\big( v(B_t) + k_t - \ell_t r \big) + \ell_t \Big( \binom{r}{2} - 1 \Big),
\end{aligned}
\]

as required.

Case 2: $\ell_t = \ell_{t-1}$.

Since $Q_t$ and $Q_{t-1}$ have the same number of components, it follows that $K^{(t)}$ must intersect some component, $C_1$, in at least two vertices, and intersects every clique not in $C_1$ in at most one vertex. Thus, the only edges of $K^{(t)}$ which are not new have both endpoints in $C_1$. Hence, letting $a = |K^{(t)} \cap C_1|$, we have

\[ e(B_t) \ge e(B_{t-1}) + \binom{r}{2} - \binom{a}{2} - 1. \]

Now, let $b$ be the number of vertices of $K^{(t)} \setminus C_1$ which are not new, and hence intersect other components of $Q_t$. Then $v(B_t) = v(B_{t-1}) + r - a - b$ and $k_t = k_{t-1} + b$, so, by the induction hypothesis for $t - 1$,

\[ e(B_t) \ge \lambda(r)\big( v(B_t) + k_t - \ell_t r \big) + \ell_t \Big( \binom{r}{2} - 1 \Big) + \binom{r}{2} - \binom{a}{2} - 1 - (r - a)\lambda(r). \]

If $a \le r - 1$ then

\[ \binom{r}{2} - \binom{a}{2} - 1 - (r - a) \frac{\binom{r}{2} - 2}{r - 2} \ge 0, \]

since the worst cases are the extremes ($a = 2$ and $a = r - 1$), and using the fact that $r \ge 4$. But if $a = r$, then our bound on $e(B_t)$ can be improved to $e(B_t) \ge e(B_{t-1})$ (which is trivial since we are not allowed to colour edges of $B_{t-1}$ red), and $v(B_t) = v(B_{t-1})$, so we are done in this case as well.

Case 3: $\ell_t < \ell_{t-1}$.

This case is more difficult, and we shall need to use Lemma 12. Set $m = \ell_{t-1} - \ell_t + 1$, and observe that $m \ge 2$, and that $K^{(t)}$ intersects $m$ components $C_1, \ldots, C_m$ in at least two vertices each, and intersects every clique not in these components in at most one vertex. Define, for each $S \in \mathcal{P}(m)$,

\[ a_S = \big| \{ v \in K^{(t)} : v \in C_j \Leftrightarrow j \in S \} \big|, \]

and set

\[ e(A) = \sum_{S \in \mathcal{P}(m)} \binom{a_S}{2} + \sum_{\{S,T\} \in J} a_S a_T, \]

where $J = \big\{ \{S,T\} \in \binom{\mathcal{P}(m)}{2} : S \cap T \ne \emptyset \big\}$, as in Lemma 12. Then

\[ e(B_t) \ge e(B_{t-1}) + \binom{r}{2} - e(A) - 1. \]

Set $a = \sum_{S \in \mathcal{P}(m)} a_S$, and let $b$ denote the number of vertices of $K^{(t)} \setminus (C_1 \cup \ldots \cup C_m)$ which intersect other components of $Q_t$. Then $v(B_t) = v(B_{t-1}) + r - a - b$. Also, let $c = \sum_{S \in \mathcal{P}(m)} a_S(|S| - 1)$, and observe that $k_t \le k_{t-1} + b - c$.

Thus, by the induction hypothesis for $t - 1$,

\[ e(B_t) \ge \lambda(r)\big( v(B_t) + k_t - \ell_t r - mr + a + c \big) + (\ell_t + m)\Big( \binom{r}{2} - 1 \Big) - e(A). \]

Note that $a \le r$ and $\sum_{S \ni j} a_S = |K^{(t)} \cap C_j| \ge 2$. Hence, by Lemma 12,

\[ e(A) \le \lambda(r)\Big( \sum_{S} a_S |S| - 2m \Big) + m = \lambda(r)(a + c) - m\Big( \lambda(r) r - \binom{r}{2} + 1 \Big), \]

since $2\lambda(r) - 1 = \lambda(r) r - \binom{r}{2} + 1$. Thus

\[ e(B_t) \ge \lambda(r)\big( v(B_t) + k_t - \ell_t r \big) + \ell_t \Big( \binom{r}{2} - 1 \Big), \]

as required. This completes the induction step, and hence the proof of the lemma. □

For completeness, let us quickly note formally that Lemma 9 follows immediately from Lemma 10.

Proof of Lemma 9. Let $F$ be an $r$-witness set for the graph $G$ and the edge $e$, and run the Red Edge Algorithm. We claim that the graph $Q_m$ is connected. To see this, consider the component $C$ of $Q_m$ which contains the edge $e$, and run the process backwards. A little thought reveals that every edge of $F(e)$ must lie in some clique in $C$, and so the component $C$ must span the entire graph $Q_m$, as claimed.

Thus $\ell_m = 1$, and so $c(v) = 1$ for every $v \in V(F)$, which means that $k_m = 0$. Hence, by Lemma 10,

\[ e(F) \ge \frac{\binom{r}{2} - 2}{r - 2} \big( v(F) - r \big) + \Big( \binom{r}{2} - 1 \Big) = \lambda(r)\big( v(F) - 2 \big) + 1, \]

as required. □

3.2. Bootstrap methods. To deduce Proposition 8 we shall borrow only one simple idea from the theory of bootstrap percolation. The following lemma is based on an idea of Aizenman and Lebowitz [3].

Lemma 13. Let $F$ be an $r$-witness set on a graph $G$, and let $L \in \mathbb{N}$. If $e(F) \ge \binom{r}{2} L$, then there exists an edge $f \in \langle G \rangle_{K_r}$ with

\[ L \le e(F(f)) \le \binom{r}{2} L \]

in the same realization of the Witness-Set Algorithm.

Proof. Run the Witness-Set Algorithm, and observe that the maximum size of $e(F(f))$, over all infected edges, increases by at most a factor of $\binom{r}{2}$ at each step of the process. It follows immediately that a graph $F(f)$ as described must have been created, at some point in the process. Moreover, such a graph exists with $F(f) \subset F$. □
We have finally finished with our deterministic preliminaries, and it is time to reintroduce randomness. There is, however, little left to do: the bound we require will follow easily from Lemmas 9 and 13 by Markov's inequality.

For each $m \in \mathbb{N}$ and every $e \in E(K_n)$, let

\[ Y_m(e) := \Big| \Big\{ S \subset [n] : e \subset S, \text{ and } e\big( G_{n,p}[S] \big) \ge m \ge \lambda(r)(|S| - 2) + 1 \Big\} \Big| \]

be the random variable which counts the number of sets $S$ which contain $e$, and also at least $m \ge \lambda(r)(|S| - 2) + 1$ edges of $G_{n,p}$. We first bound the expected size of $Y_m(e)$.

Lemma 14. For every $r \ge 4$, there exists a $C(r) > 0$ such that the following holds. If $n \in \mathbb{N}$ and $p > 0$ satisfy $p n^{1/\lambda(r)} \log n \le 1/(2e)$, and $n$ is sufficiently large, then



ny,.ie)) < ."■^^w^"'-^'" 



2Q\ogn, 

for every $e \in E(K_n)$ and every $\lambda(r) + 1 \le m \le \binom{r}{2} \log n$.

Proof. Let $\ell \in \mathbb{N}$ be maximal such that $m \ge \lambda(r)(\ell - 2) + 1$. Then $|S| \le \ell$, and hence

^ ""^ ^^ V^-2y V ^ / (^-2)! V2my V 2m J 



Note that 1 ^ m - A(r)(£ - 2) ^ A(r), and that jjZiyX^) < 1 for every a G [A(r)], 
assuming n is sufficiently large. Thus it suffices to observe that 

f m + C{r) 

if C{r) is sufficiently large, since m(^m + C(r)) ^ (m + 2A(r)) ^ (A(r)£) ^ ^-^. D 

We can now easily deduce Proposition 8.

Proof of Proposition 8. Let $r \ge 4$, let $n \in \mathbb{N}$, and let $p = p(n) > 0$ satisfy $p n^{1/\lambda(r)} \log n \le 1/(2e)$. We claim that, for every $e \in E(K_n)$,

\[ \mathbb{P}\big( e \in \langle G_{n,p} \rangle_{K_r} \big) \to 0 \]

as $n \to \infty$. Indeed, suppose that $e \in \langle G_{n,p} \rangle_{K_r}$, run the Witness-Set Algorithm, and consider the graph $F = F(e) \subset G_{n,p}$.

Suppose first that $e(F) \le \binom{r}{2} \log n$. By Lemma 9 we have

\[ e(F) \ge \lambda(r)\big( v(F) - 2 \big) + 1, \]

and thus either $e \in G_{n,p}$, or $Y_m(e) \ge 1$ for some

\[ \lambda(r) + 1 \le m \le \binom{r}{2} \log n. \]
By Lemma 14, this has probability at most

P + J: E(y;„(.)) < p . ^ g±^ -. 0, 

m=A(r)+l m=A(r)+l ^ ^ ^2^ ^"8 '^^ / 

as $n \to \infty$, as claimed.

So suppose next that $e(F) \ge \binom{r}{2} \log n$. By Lemma 13, there must exist an edge $f$ in $K_n$ such that $\log n \le e(F(f)) \le \binom{r}{2} \log n$, which means that $Y_m(f) \ge 1$ for some $\log n \le m \le \binom{r}{2} \log n$. By Lemma 14, the expected number of such edges $f$ is at most



2/ V VsQfogriy Vr , 

' m=log n ^ "^'i'' '-' ' N - 

as $n \to \infty$, since $r \ge 4$. This proves the proposition. □

We finish by noting that Theorem 1 follows immediately from Propositions 3 and 8.

Proof of Theorem 1. By Proposition 3, we have

\[ p_c(n,H) \le (\log n)^{2/\lambda(H)} n^{-1/\lambda(H)} \]

for every balanced graph $H$. Moreover $K_r$ is balanced, since

\[ \frac{\binom{r-1}{2} - 1}{r - 3} \le \frac{\binom{r}{2} - 2}{r - 2} \]

for every $r \ge 4$, and $\lambda(K_r) = \lambda(r) \ge 2$, so the upper bound follows.

For the lower bound, suppose that $p n^{1/\lambda(r)} \log n \le 1/(2e)$, and $n$ is sufficiently large. By Proposition 8, we have

\[ \mathbb{P}\big( e \in \langle G_{n,p} \rangle_{K_r} \big) \to 0 \]

for every edge $e \in E(K_n)$. Thus $G_{n,p}$ does not percolate, with high probability, as required. □

4. The threshold for $K_4$-percolation

In this section we shall prove Theorem 2, which determines the threshold for $K_4$-percolation on $K_n$. The proof is quite different from that of Theorem 1, and uses ideas from the study of 2-neighbour bootstrap percolation on $[n]^d$; see in particular [3], [U], [21].

We begin with a simple but key observation. A hypergraph $\mathcal{H}$ is said to be triangle-free if there do not exist distinct vertices $v_1, v_2, v_3$ and edges $e_1, e_2, e_3 \in E(\mathcal{H})$ such that $v_1 \in e_2 \cap e_3$, $v_2 \in e_1 \cap e_3$ and $v_3 \in e_1 \cap e_2$.

Observation 15. For every graph $G$, the graph $\langle G \rangle_{K_4}$ consists of a collection of edge-disjoint cliques. Moreover, the hypergraph defined by these cliques is triangle-free.
Proof. For the first part, simply note that if two cliques $R_1$ and $R_2$ share more than one vertex, then the closure $\langle R_1 \cup R_2 \rangle_{K_4}$ is a clique on vertex set $V(R_1) \cup V(R_2)$. To prove the second part, observe that if $R_1$, $R_2$ and $R_3$ form a triangle, then the closure $\langle R_1 \cup R_2 \cup R_3 \rangle_{K_4}$ is a clique on vertex set $V(R_1) \cup V(R_2) \cup V(R_3)$. □
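
One convenient way to test Observation 15 on small examples is the following. If a graph decomposes into edge-disjoint cliques forming a triangle-free hypergraph, then for every edge $uv$ the set $\{u, v\} \cup (N(u) \cap N(v))$ must span a clique (it is exactly the clique of the decomposition containing $uv$). The Python sketch below checks this condition for the $K_4$-closure of a graph; the function names are ours, and the closure is computed by brute force.

```python
import random
from itertools import combinations

def k4_closure(n, edges):
    """Closure under the K_4-bootstrap process on range(n)."""
    infected = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        for quad in combinations(range(n), 4):
            missing = [frozenset(p) for p in combinations(quad, 2)
                       if frozenset(p) not in infected]
            if len(missing) == 1:
                infected.add(missing[0])
                changed = True
    return infected

def is_union_of_cliques(n, edge_set):
    """For every edge uv, check that {u, v} and their common neighbours span a clique."""
    adj = {v: set() for v in range(n)}
    for e in edge_set:
        u, v = tuple(e)
        adj[u].add(v)
        adj[v].add(u)
    for e in edge_set:
        u, v = tuple(e)
        block = {u, v} | (adj[u] & adj[v])
        if any(b not in adj[a] for a, b in combinations(sorted(block), 2)):
            return False
    return True
```

A graph which is not closed, such as $K_4$ minus an edge, fails this test, while its closure passes.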

Say that a clique $K$ is internally spanned by a graph $G$ if $\langle G \cap K\rangle_{K_4} = K$. We shall study,
for each $\ell \in \mathbb{N}$ and $p > 0$, the probability

\[ P(\ell,p) := \mathbb{P}\big(K_\ell \text{ is internally spanned by } G_{n,p}\big). \]
The following bounds are both straightforward.

Lemma 16. For every $3 \le \ell \in \mathbb{N}$ and $p \in (0,1)$ with $p\ell^2 \le 1$,



2i 



Proof. For the lower bound, simply count the graphs on vertex set $[\ell]$, and with $2\ell - 3$ edges,
in which every vertex $j \ge 3$ sends two edges 'backwards' in the order induced by $\mathbb{Z}$. It is
easy to see, by induction on $t$, that the clique $K_t$ with vertex set $[t]$ is internally spanned,
for each $t \in [\ell]$. The number of such graphs is



^ 2 / " 2^P " (2e^Y ' 

by Stirling's formula, and each is an induced subgraph of $G_{n,p}$ with probability at least
$p^{2\ell-3}(1-p)^{\ell^2} \ge p^{2\ell-3}/2\pi$. Since these events are mutually exclusive, the lower bound follows.

For the upper bound, recall that if a graph $G$ internally spans $K_\ell$, then $e(G) \ge 2\ell - 3$.
(This was first proved in [13]; see also Lemma 18 for a short proof.) It follows that



p(^,p) < iS^^i^" « (i^Y'^'i^pr" < (i^riiYi^p) 



2^-3 

as required. □
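The counting step in the lower bound can be checked by brute force for small $\ell$. The sketch below enumerates the graphs described in the proof (vertices relabelled $0,\dots,\ell-1$, so the first two vertices are joined and every later vertex picks two earlier neighbours), and confirms that each has $2\ell - 3$ edges and spans $K_\ell$. The helper names are hypothetical. The product formula $\prod_{j=3}^{\ell}\binom{j-1}{2}$ used for comparison is the exact count that follows directly from the construction; the Stirling estimate of the paper is not reproduced here.

```python
from itertools import combinations, product
from math import comb, prod


def k4_closure(n, edges):
    """Return the edge set of <G>_{K_4} for a graph G on vertices 0..n-1."""
    infected = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        for quad in combinations(range(n), 4):
            missing = [frozenset(p) for p in combinations(quad, 2)
                       if frozenset(p) not in infected]
            if len(missing) == 1:
                infected.add(missing[0])
                changed = True
    return infected


def backward_graphs(l):
    """Yield graphs on 0..l-1: edge {0,1}, and each later vertex j picks two earlier ones."""
    choices = [list(combinations(range(j), 2)) for j in range(2, l)]
    for pick in product(*choices):
        edges = [(0, 1)]
        for j, (a, b) in zip(range(2, l), pick):
            edges += [(a, j), (b, j)]
        yield edges


l = 5
graphs = list(backward_graphs(l))
# With 0-indexed vertices, vertex j has comb(j, 2) ways to choose its two back-edges.
count_formula = prod(comb(j, 2) for j in range(2, l))
all_span = all(len(k4_closure(l, g)) == comb(l, 2) for g in graphs)
```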

4.1. The lower bound. The following lemma, like Lemma 13, is based on an idea of
Aizenman and Lebowitz [3], who proved the corresponding result in the context of
two-neighbour bootstrap percolation on $[n]^2$. The lower bound in Theorem 2 will follow by
combining it with Lemma 16.

Lemma 17. Suppose that $\langle G\rangle_{K_4} = K_n$, and let $1 \le L \le n$. There exists a clique $K \subset K_n$
which is internally spanned by $G$, with

\[ L \le v(K) \le 3L. \]

We remark that this result does not generalize to $K_r$-percolation for $r \ge 5$. In fact, it is
not hard to construct a graph $G$ for which $\langle G\rangle_{K_r} = K_n$, but no clique $K_\ell$ with $r < \ell < n$ is
internally spanned.

In order to prove Lemma 17, we shall introduce a simple algorithm for filling $K_n$, which we
call the Clique Process. This algorithm will also provide a short proof of the bound $e(G) \ge 2\ell - 3$
for a graph $G$ which internally spans $K_\ell$. It is analogous to the 'rectangle process'
in two-neighbour bootstrap percolation on $[n]^2$ (see Proposition 30 of [25] or Theorem 11
of [U).

The Clique Process. Let $G$ be a graph on $n$ vertices, and run the $K_4$-process as follows:

0. At each step of the process, we will maintain a collection $(R_1,A_1),\dots,(R_m,A_m)$,
where $R_j$ is a clique and $A_j \subset E(G)$, such that $\langle A_j\rangle_{K_4} = R_j$ for each $j \in [m]$.

1. At time zero, set $R_j = A_j = \{e_j\}$ for each $j \in [m]$, where $E(G) = \{e_1,\dots,e_m\}$.

2. At time $t \in 2\mathbb{Z}$, choose a pair $\{i,j\}$ such that $|R_i \cap R_j| \ge 2$, if such a pair exists.
Delete $(R_i,A_i)$ and $(R_j,A_j)$, and replace them with $(\langle A_i \cup A_j\rangle_{K_4}, A_i \cup A_j)$.

3. At time $t \in 2\mathbb{Z}+1$, choose a triple $\{i,j,k\}$ such that $R_i$, $R_j$ and $R_k$ form a triangle in
the hypergraph defined by the cliques, if such a triple exists. Delete $(R_i,A_i)$, $(R_j,A_j)$
and $(R_k,A_k)$, and replace them with $(\langle A_i \cup A_j \cup A_k\rangle_{K_4}, A_i \cup A_j \cup A_k)$.

4. Repeat steps 2 and 3 until all edges of $\langle G\rangle_{K_4}$ are infected.

The algorithm terminates by the proof of Observation 15. Observe moreover that the $A_j$
are in fact disjoint sets of edges of $G$. We can now prove Lemma 17.
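The steps above can be sketched in Python as follows. This is a simplified illustration rather than a literal transcription. It does not alternate between even and odd times, but always tries a pair merge first and a triangle merge second. It also represents each clique $R_j$ by its vertex set, relying on the proof of Observation 15 for the fact that the merged closure is a clique on the union of the vertex sets. The function name `clique_process` and the example graph are our own.

```python
from itertools import combinations


def clique_process(edges):
    """Run a simplified Clique Process.

    Returns the final list of pairs (V(R_j), A_j) and the history of max_j v(R_j).
    """
    parts = [(frozenset(e), frozenset([frozenset(e)])) for e in edges]
    history = [2]
    while True:
        merge = None
        # Step 2: two cliques sharing at least two vertices.
        for i, j in combinations(range(len(parts)), 2):
            if len(parts[i][0] & parts[j][0]) >= 2:
                merge = (i, j)
                break
        # Step 3: three cliques forming a triangle in the hypergraph.
        if merge is None:
            for i, j, k in combinations(range(len(parts)), 3):
                ri, rj, rk = parts[i][0], parts[j][0], parts[k][0]
                corners = (rj & rk) | (ri & rk) | (ri & rj)
                if rj & rk and ri & rk and ri & rj and len(corners) == 3:
                    merge = (i, j, k)
                    break
        if merge is None:
            return parts, history
        verts = frozenset().union(*(parts[m][0] for m in merge))
        spanning = frozenset().union(*(parts[m][1] for m in merge))
        parts = [p for m, p in enumerate(parts) if m not in merge]
        parts.append((verts, spanning))
        history.append(max(len(r) for r, _ in parts))


# A graph on 5 vertices with 7 edges in which each new vertex sends two edges backwards.
example = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 4), (3, 4)]
parts, history = clique_process(example)
```

The list `history` records the size of the largest clique after each merge, which is the quantity tracked in the proof of Lemma 17.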

Proof of Lemma 17. Suppose that $\langle G\rangle_{K_4} = K_n$, and run the Clique Process for $G$. At each
step of the process, the value of $\max_{j \in [m]} v(R_j)$ increases by a factor of at most three. Hence,
for every $L \in [n]$, there exists a clique $K = R_j \subset K_n$, with

\[ L \le v(K) \le 3L, \]

which is internally spanned by $G$, as claimed. □

We can also easily deduce the following bound, which was first proved by Bollobás [13].

Lemma 18. If $G$ internally spans $K_\ell$ then $e(G) \ge 2\ell - 3$.

Proof. We shall use induction on $\ell$; for $\ell \le 3$ the result is trivial. Now suppose that $G$
internally spans $R = K_\ell$, and run the Clique Process for $G$. At the penultimate step we
have either two or three disjointly internally spanned proper sub-cliques of $R$, which together
span $R$. If these cliques are $S = \langle A\rangle_{K_4}$ and $T = \langle B\rangle_{K_4}$, and $A \cap B = \emptyset$, then

\[ e(G) \ge e(A) + e(B) \ge 2\big(v(S) + v(T)\big) - 6 \ge 2v(R) - 2, \]

since $|S \cap T| \ge 2$, so $v(S) + v(T) \ge v(R) + 2$. If they are $S = \langle A\rangle_{K_4}$, $T = \langle B\rangle_{K_4}$ and
$U = \langle C\rangle_{K_4}$, with $A$, $B$, $C$ pairwise disjoint, then we have

\[ e(G) \ge e(A) + e(B) + e(C) \ge 2\big(v(S) + v(T) + v(U)\big) - 9 \ge 2v(R) - 3, \]

since $(S, T, U)$ form a triangle, so $v(S) + v(T) + v(U) \ge v(R) + 3$. □
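For very small $\ell$ the bound of Lemma 18 can be confirmed by exhaustive search. The following sketch (with a hypothetical helper `min_spanning_edges`) finds the smallest number of edges of a graph on $\ell$ vertices whose $K_4$-closure is $K_\ell$. It is only a sanity check for $\ell \le 5$, since the search is exponential in $\binom{\ell}{2}$.

```python
from itertools import combinations


def k4_closure(n, edges):
    """Return the edge set of <G>_{K_4} for a graph G on vertices 0..n-1."""
    infected = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        for quad in combinations(range(n), 4):
            missing = [frozenset(p) for p in combinations(quad, 2)
                       if frozenset(p) not in infected]
            if len(missing) == 1:
                infected.add(missing[0])
                changed = True
    return infected


def min_spanning_edges(l):
    """Smallest e(G) over graphs G on l vertices with <G>_{K_4} = K_l."""
    pairs = list(combinations(range(l), 2))
    for m in range(len(pairs) + 1):
        for edges in combinations(pairs, m):
            if len(k4_closure(l, edges)) == len(pairs):
                return m


results = {l: min_spanning_edges(l) for l in (3, 4, 5)}
```

In these cases the minimum equals $2\ell - 3$ exactly, so the bound is attained.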

We can now prove the lower bound on $p_c(n, K_4)$ in Theorem 2. It follows easily from
Lemmas 16 and 17, using Markov's inequality.

Proposition 19. If $p^2 n \log n \le 16/(9e^3)$, then

\[ \mathbb{P}\big(\langle G_{n,p}\rangle_{K_4} = K_n\big) \to 0 \]

as $n \to \infty$.

Proof. Let $p^2 n \log n = 16/(9e^3)$ and $L = \log n$. By Lemma 17, if $\langle G_{n,p}\rangle_{K_4} = K_n$ then there
exists an internally spanned clique $R$ with $L \le v(R) \le 3L$. By Lemma 16, the expected
number of such cliques is at most



Thus 



A) ^ ^' ^ ^^-^ \ipj\ 16 J ^ -^^ Ke^logn 






as $n \to \infty$, as required. □

4.2. The upper bound. We shall use the second moment method (and Lemma 16) in order
to show that $G_{n,p}$ internally spans a clique of order $\sim \log n$ with high probability. We will
then deduce the upper bound in Theorem 2 using sprinkling.

Let $X(\ell,p)$ denote the random variable which counts the number of copies of $K_\ell$ which
are internally spanned by $G_{n,p}$. We first bound the expected value of $X(\ell,p)$.

Lemma 20. For every $n \in \mathbb{N}$, $3 \le \ell \in \mathbb{N}$ and $p \in (0,1)$ with $p\ell^2 \le 1$,

Proof. By Lemma 16 we have



2 /)\ ^ / 1 \ 3 



E{xii,p)) ^ (;)(^)'(^i>)"-^ ^ 



2e2 
as required. □

To bound the variance of $X(\ell,p)$, we shall use the following extension of Lemma 18. Given
cliques $S \subset R$, let

\[ D(S,R) := \big\{ \langle (G_{n,p} \cup S) \cap R \rangle_{K_4} = R \big\} \]

denote the event that $R$ is internally spanned by $G_{n,p} \cup S$. Lemma 18 is equivalent to the
case $v(S) = 3$ of the following lemma.

Lemma 21. If $D(S,R)$ holds, then $e\big((G_{n,p} \setminus S) \cap R\big) \ge 2\big(v(R) - v(S)\big)$.

Proof. We shall use induction on $\ell = v(R)$. Suppose that $D(S,R)$ holds, and apply the Clique
Process, except starting with the clique $S$ already formed. Suppose at the penultimate step
we have two disjointly internally spanned cliques, $T = \langle A \cup S\rangle_{K_4}$ and $U = \langle B\rangle_{K_4}$, where
$A \cap S = A \cap B = \emptyset$. (That $A$ and $B$ may be taken to be disjoint follows by the comment
after the Clique Process.) By the induction hypothesis, we have

\[ e\big((G_{n,p} \setminus S) \cap R\big) \ge e(A) + e(B) \ge 2\big(v(T) - v(S)\big) + 2v(U) - 3 > 2\big(v(R) - v(S)\big), \]

since $|T \cap U| \ge 2$. The case of three cliques is similar, so we shall skip the details. □
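As with Lemma 18, the bound of Lemma 21 can be sanity-checked by exhaustive search in tiny cases. The sketch below fixes $S = K_k$ on the vertices $0,\dots,k-1$ and looks for the fewest additional edges which, together with $S$, span $K_\ell$. The helper name `min_extra_edges` is hypothetical, and a deterministic graph plays the role of $G_{n,p}$.

```python
from itertools import combinations


def k4_closure(n, edges):
    """Return the edge set of <G>_{K_4} for a graph G on vertices 0..n-1."""
    infected = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        for quad in combinations(range(n), 4):
            missing = [frozenset(p) for p in combinations(quad, 2)
                       if frozenset(p) not in infected]
            if len(missing) == 1:
                infected.add(missing[0])
                changed = True
    return infected


def min_extra_edges(k, l):
    """Fewest edges outside S = K_k (vertices 0..k-1) such that G together with S spans K_l."""
    core = list(combinations(range(k), 2))
    outside = [p for p in combinations(range(l), 2) if p not in core]
    for m in range(len(outside) + 1):
        for extra in combinations(outside, m):
            if len(k4_closure(l, core + list(extra))) == l * (l - 1) // 2:
                return m


checks = {(k, l): min_extra_edges(k, l) for k, l in [(2, 4), (3, 5), (4, 5)]}
```

In each of these cases the minimum equals $2(\ell - k)$, matching the lemma with $v(R) = \ell$ and $v(S) = k$.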

We can now bound the variance of $X(\ell,p)$. Let $P(k,\ell) = \mathbb{P}\big(D(K_k, K_\ell)\big)$.

Lemma 22. Let $n \in \mathbb{N}$, $4\ell \le \log n$ and $p^2 n \log n \ge 1/4e$. Then

\[ \mathrm{Var}\big(X(\ell,p)\big) \ll \mathbb{E}\big(X(\ell,p)\big)^2 \]

as $n \to \infty$.

Proof. We first claim that 

\[ \mathrm{Var}\big(X(\ell,p)\big) \le \sum_{k=2}^{\ell} \mathbb{E}\big(X(\ell,p)\big) \binom{\ell}{k} \binom{n}{\ell-k} P(k,\ell). \]

This follows by considering ordered pairs $(S,T)$ of internally spanned $\ell$-cliques which intersect
in a $k$-clique. By Lemma 21, if $D(S \cap T, T)$ holds then there are at least $2(\ell - k)$ edges of
$G_{n,p}$ in $T \setminus S$, and so



-<«) * I V-TV-' < F^ 



Thus, by Lemma 



P(kJ)< ('^^)""'(j^) Crt'ElXl^.p)). 



But, using the fact that {i + k) ^ e (£ — k) , an easy calculation gives 
i\f n \ f eii + k)py^'-''^ f 2e^ Y ^ ( f\' f ^ 



,2 



JvJ\C, — kJ \ A J \p'^nij v4/ \8kp'^n^ 

Hence, using the fact that {1/Cx)^ ^ gi/Ce ^^^^ ^^^^ ^^^ ^ ^-1/2-0(1)^ ^^ obtain 

Var(X(£,p)) < E(X(£,p))^(£p)^(^yX:(^)' « E(X(£,p))^ 

as required. □

Using Chebychev's inequality, and sprinkling, we can now deduce the following bound on $p_c(n, K_4)$.

Proposition 23. If $p^2 n \log n \ge 18$, then

\[ \mathbb{P}\big(\langle G_{n,p}\rangle_{K_4} = K_n\big) \to 1 \]

as $n \to \infty$.

Proof. Set $4\ell = \log n$ and $p^2 n \log n = 1 > 1/4e$, and observe that the conditions of Lemmas 20
and 22 are satisfied. By Lemma 20, we obtain

.2^/?\^ / 1 \ 3 / 1 \ l°g"/4 



^™ > (^) (i) > (^) 



^3/2-0(1) ^ ^^ 



Thus, by Lemma 22 and Chebychev's inequality, with high probability there exists a copy
of $K_\ell$ which is internally spanned by $H := G_{n,p}$.

Now set pj = 2~^^^p for each j G N, and let Gj = Gn,pj be independent random graphs 
with density pj. We make the following claim.

Claim 1: Let e > be sufficiently small. If 2^-'~^£ = t ^ en, then {Gj U Kt)K4, contains a
clique of size 4t, with probability at least 1 — e~*/^. 

Proof of Claim 1. Observe that every vertex $v$ that has at least two neighbours in $K_t$ (in the
graph $G_j$) is added to the clique in $\langle G_j \cup K_t\rangle_{K_4}$. It therefore suffices to show that there are
at least $3t$ such vertices, with high probability. The expected number of such vertices is at
least

since pjt = 2^+^ip = SpVii = 0{^/6). 

This event (having two neighbours) is independent for each vertex. Thus, by Chernoff's
inequality, with probability at least $1 - e^{-t/8}$, the number of such vertices is at least $3t$, as
required. □

We apply the claim for each $j \ge 0$. It follows that, with high probability, $\langle H \cup \bigcup_j G_j\rangle_{K_4}$
contains a clique of order $\varepsilon n$, for some $\varepsilon > 0$. Finally, let $H'$ be another independent copy
of $G_{n,p}$.

Claim 2: If $t \ge \varepsilon n$, then $\langle H' \cup K_t\rangle_{K_4} = K_n$ with high probability.

Proof of Claim 2. We apply the same argument as in the proof of Claim 1. Indeed, the
probability that a vertex $v$ has at most one neighbour in $K_t$ is at most

\[ (1-p)^t + tp(1-p)^{t-1} \le (1 + 2tp)e^{-tp} \le \frac{1}{n^2} \]

as $n \to \infty$. Hence, by Markov, the probability that there exists such a vertex is at most $1/n$,
as required. □
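The first inequality in the proof of Claim 2 follows from $(1-p)^t \le e^{-tp}$ together with $1/(1-p) \le 2$, which holds when $p \le 1/2$. That restriction is our assumption; it is harmless here since $p \to 0$. The short numerical sketch below checks the inequality on a small grid. The function names and the grid are illustrative only.

```python
import math


def prob_at_most_one_neighbour(t, p):
    """P(Bin(t, p) <= 1): a vertex has at most one neighbour among t clique vertices."""
    return (1 - p) ** t + t * p * (1 - p) ** (t - 1)


def claim_bound(t, p):
    """The upper bound (1 + 2tp) e^{-tp} used in the proof of Claim 2."""
    return (1 + 2 * t * p) * math.exp(-t * p)


grid = [(t, p) for t in (1, 2, 10, 100, 1000) for p in (0.001, 0.01, 0.1, 0.5)]
# Small tolerance guards against floating-point rounding only.
holds = all(prob_at_most_one_neighbour(t, p) <= claim_bound(t, p) + 1e-15
            for t, p in grid)
```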

To complete the proof, we simply note that the graph $G = H \cup H' \cup \bigcup_j G_j$ is a random
graph $G_{n,p^*}$ of density

\[ p^* \le 2p + \sum_j p_j \le 18p, \]

and $G_{n,p^*}$ percolates in the $K_4$-process with high probability, as required. □



Theorem 2 follows immediately from Propositions 19 and 23.

5. Other graphs, and open problems 

In this section we shall mention some simple results for graphs other than $K_r$, and state
several of the many open problems relating to this model. Since the results will all be fairly
straightforward, we shall only sketch the proofs. We begin by stating a simple extension of
the (trivial) result for the $K_3$-process mentioned in the Introduction.

Proposition 24. Let $H = C_k$ for some $k \ge 3$, or $H = K_{2,3}$. Then,

\[ p_c(n,H) = \frac{\log n}{n} + \Theta\Big(\frac{1}{n}\Big). \]

Sketch of proof. We shall show that, with high probability, the graph $G_{n,p}$ percolates in the
$H$-bootstrap process if and only if it is connected. The bounds on $p_c(n,H)$ then follow by
standard results, see [Ti].

Indeed, first let $H = C_k$ and consider a path of length at least $k$ attached to a triangle; we
claim that this graph spans a clique (on its vertex set). To see this, identify the vertices with
$[\ell]$ so that the edges are $\{i(i+1) : i \in [\ell-1]\} \cup \{13\}$, and say that $ij$ is a $t$-edge if $|i - j| = t$.
The edges are infected in the following order: $(k-1)$-edges, $k$-edges, 2-edges, 3-edges,
4-edges, and so on. Finally, observe that if $G_{n,p}$ is connected then, with high probability, every
vertex has a path of length at least $k$ leading to a triangle.

For $H = K_{2,3}$ the proof is similar. Let $x, y \in V(G_{n,p})$, and suppose that there exist
vertex-disjoint paths from $x$ and $y$ to opposite corners of a copy of $C_4$. Then it is easy to see that
the percolation process works its way along these paths and eventually infects the edge $xy$.
This gives a large complete bipartite graph, and if there is an edge in each part then the
closure is a complete graph. Since $G_{n,p}$ is connected, every vertex is eventually swallowed
by this clique. □
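The claim about a path attached to a triangle can be tested directly in a small case. The sketch below runs a brute-force $H$-bootstrap process for $H = C_4$ on the graph with vertices $0,\dots,6$, path edges $i(i+1)$ and the extra chord $\{0,2\}$ (the 0-indexed version of the edge $13$ above). For comparison it also runs the process on the bare path, which is bipartite and so cannot percolate under $C_4$. The function `h_bootstrap_closure` is a hypothetical helper that tries every injective placement of $H$, so it is only practical for very small $n$.

```python
from itertools import permutations


def h_bootstrap_closure(n, edges, h_order, h_edges):
    """Closure of a graph on 0..n-1 under H-bootstrap percolation (brute force)."""
    infected = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        for phi in permutations(range(n), h_order):
            image = [frozenset((phi[a], phi[b])) for a, b in h_edges]
            missing = [e for e in image if e not in infected]
            # A copy of H with exactly one uninfected edge infects that edge.
            if len(missing) == 1:
                infected.add(missing[0])
                changed = True
    return infected


C4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
path = [(i, i + 1) for i in range(6)]      # the path 0-1-2-3-4-5-6
with_triangle = path + [(0, 2)]            # chord closing the triangle on 0, 1, 2

closure = h_bootstrap_closure(7, with_triangle, 4, C4)
path_only = h_bootstrap_closure(7, path, 4, C4)
```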

The case $H = K_{2,3}$ is the first we have seen for which $p_c(n,H) \ne n^{-1/\lambda(H)+o(1)}$. We shall
now determine a large family of such graphs. Define

\[ \lambda^*(H) := \min_{e \in E(H)} \max_{F \subset H - e} \frac{e(F)}{v(F)}. \]

This parameter gives us a general lower bound on $p_c(n, H)$.

Proposition 25. For every graph $H$, there exists a constant $c(H)$ such that

\[ p_c(n,H) \ge c(H)\, n^{-1/\lambda^*(H)} \]

for every $n \in \mathbb{N}$.

Sketch of proof. We shall show that if $p \le c(H) n^{-1/\lambda^*(H)}$ then, with probability at least $1/2$,
no new edges are infected in the $H$-bootstrap process. To do so, for each $e \in E(H)$ choose a
subgraph $F = F(e) \subset H - e$ which maximizes $e(F)/v(F)$, and note that $e(F)/v(F) \ge \lambda^*(H)$.
Thus, the expected number of copies of $F$ in $G_{n,p}$ is at most

\[ n^{v(F)} p^{e(F)} \le c(H)\, n^{v(F) - e(F)/\lambda^*(H)} \le c(H). \]

Summing over edges of $H$, we obtain

\[ \mathbb{P}\big(F(e) \subset G_{n,p} \text{ for some } e \in E(H)\big) \le e(H) c(H) < \frac{1}{2}, \]

if $c(H)$ is sufficiently small. But if $F(e) \not\subset G_{n,p}$ for every $e \in E(H)$ then $H - e \not\subset G_{n,p}$ for
every $e \in E(H)$, and hence no new edges are infected, as claimed. □

We next show that Proposition 25 is sharp for a large class of graphs $H$.

Proposition 26. If $H$ has a leaf, then

\[ p_c(n,H) = \Theta\big(n^{-1/\lambda^*(H)}\big). \]

Sketch of proof. The lower bound follows from Proposition 25. For the upper bound, let
$p \gg n^{-1/\lambda^*(H)}$ and recall (see [14]) that, with high probability, $H - e \subset G_{n,p}$ for some $e \in E(H)$.
(To see this, let $e$ and $F \subset H - e$ be such that $e(F)/v(F) = \max_{F' \subset H - e} e(F')/v(F') = \lambda^*(H)$,
find a copy of $F$ in $G_{n,p}$ by the second moment method, and then find $H - e$ by sprinkling.)
Let $v_1$ be the neighbour of a leaf in $H$, and observe that we can infect every edge which is
incident with $v_1$ (and is not in our copy of $H - e$).

Now, take a second, independent copy of $G_{n,p}$, and apply the same argument inside the
neighbourhood of $v_1$. We find a vertex $v_2$ such that we can add (almost) all edges incident
with $v_2$. Repeating this process $v(H)$ times, we find (with high probability) a clique on $v(H)$
vertices in $\langle G_{n,p^*}\rangle_H$, where $p^* = v(H)p$.

Finally, observe that $\langle K_{v(H)}\rangle_H = K_n$, since we may add the remaining vertices to the
clique one by one. Thus $p_c(n,H) = O\big(n^{-1/\lambda^*(H)}\big)$, as claimed. □

A slightly less trivial case, which lies somewhere between a clique and a tree, also matches
the general lower bound in Proposition 25. Say that $H$ is an $r$-clique-tree if (for some
$2 \le \ell \in \mathbb{N}$) it is composed of $\ell$ disjoint copies of $K_r$, plus $\ell - 1$ extra edges, and is connected.

Proposition 27. Let $H$ be an $r$-clique-tree. Then

for some $c(H) > 0$.

Sketch of proof. The lower bound again follows by Proposition 25. For the upper bound,
first observe that

where $v(H) = \ell r$. Assume first that $\ell \ge 3$, and let $p \gg n^{-1/\lambda^*(H)}$ (we shall prove a stronger
result in this case). Note that, as in the previous proof, $H - e \subset G_{n,p}$ for some $e \in E(H)$
with high probability; in fact, there exist at least $v(H)$ copies of $H - e$. Moreover, for a
suitable constant $\varepsilon > 0$, there exist at least $n^{\varepsilon}$ copies of $K_r$ in $G_{n,p}$. Let $X$ denote the union
of those copies of $K_r$ which do not intersect a copy of $H - e$.

From each copy of $H - e$, pick a clique $R$ which is the neighbour of a leaf (in the
tree-structure of $H$), and observe that we may infect every edge between $R$ and $X$. We thus
obtain a complete bipartite graph, with parts of size $v(H)$ and $n^{\varepsilon}$. Moreover, each part
consists of $r$-cliques, and thus these edges span a clique on the same vertex set.

Finally, sprinkling edges with density $p$, we see that every vertex in a copy of $K_r$ minus an
edge, and with a neighbour in $X$, is added to the clique. With high probability there are $n^{2\varepsilon}$
such vertices. Repeating this process $1/\varepsilon$ times, we infect the entire edge set, as required.

For the case $\ell = 2$ we prove the weaker bound in the statement. Let $p$ be as above, and
take $\log n$ copies of $G_{n,p}$. In the first we span a clique of order $C$, for some large constant $C$;
in the second a clique of order $C^2$; in the third $C^3$, and so on. In the first step this is just the
union of copies of $K_r$; in later steps it is the union of copies of $K_r$ minus an edge which have
a neighbour in the clique formed in the previous step. The proposition now follows. □

We give one final cautionary example, whose purpose is just to point out that $\lambda(H)$ and
$\lambda^*(H)$ are not the only possible values of

\[ -\limsup_{n \to \infty} \frac{\log n}{\log p_c(n,H)}. \]

Let $DD_r$ denote the 'double-dumbbell', the graph consisting of two disjoint copies of $K_r$,
plus two extra (disjoint) edges between the two cliques. Note that $\lambda(DD_r) = r/2$ and
$\lambda^*(DD_r) = \big(2\binom{r}{2} + 1\big)/2r$, and therefore

\[ \lambda^*(DD_r) < \frac{\binom{r}{2} + 1}{r} < \lambda(DD_r). \]
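The two parameters can be computed exactly for small graphs. The sketch below does so with rational arithmetic. It assumes the definition $\lambda(H) = (e(H) - 2)/(v(H) - 2)$, which is not restated in this section but is consistent with the value $\lambda(DD_r) = r/2$ quoted above. It also uses the fact that the maximum density $e(F)/v(F)$ over subgraphs is attained by an induced subgraph. All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def lam(vertices, edges):
    """lambda(H) = (e(H) - 2) / (v(H) - 2)  (assumed definition, see lead-in)."""
    return Fraction(len(edges) - 2, len(vertices) - 2)


def max_density(vertices, edges):
    """Maximum of e(F)/v(F) over subgraphs F, attained on induced subgraphs."""
    best = Fraction(0)
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            s = set(subset)
            inside = sum(1 for u, v in edges if u in s and v in s)
            best = max(best, Fraction(inside, size))
    return best


def lam_star(vertices, edges):
    """lambda*(H): minimum over edges e of the maximum density of a subgraph of H - e."""
    return min(max_density(vertices, [f for f in edges if f != e]) for e in edges)


def double_dumbbell(r):
    """DD_r: two disjoint copies of K_r plus two disjoint edges between them."""
    left, right = list(range(r)), list(range(r, 2 * r))
    edges = list(combinations(left, 2)) + list(combinations(right, 2))
    edges += [(0, r), (1, r + 1)]
    return left + right, edges


dd_vertices, dd_edges = double_dumbbell(4)
dd_lam = lam(dd_vertices, dd_edges)
dd_lam_star = lam_star(dd_vertices, dd_edges)

k23_vertices = [0, 1, 2, 3, 4]
k23_edges = [(a, b) for a in (0, 1) for b in (2, 3, 4)]
k23_lam = lam(k23_vertices, k23_edges)
k23_lam_star = lam_star(k23_vertices, k23_edges)
```

For $r = 4$ this gives $\lambda^*(DD_4) = 13/8 < 7/4 < 2 = \lambda(DD_4)$, and for $K_{2,3}$ it gives $\lambda^* = 1 < 4/3 = \lambda$.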

Proposition 28. For every $r \ge 4$,

Sketch of proof. The key observation is that if $H = DD_r$ and $e \in E(DD_r)$, then $\langle H - e\rangle_H = K_{|H|}$,
i.e., a copy of $DD_r$ spans a clique on its vertex set. Moreover, two ($\ge 2r$)-cliques which
overlap in two (or more) points span a clique on their union. We shall use these observations,
plus the usual 'critical droplet' argument from bootstrap percolation on $[n]^d$.

Let's begin with the (easier) upper bound. Let $n^r p^{\binom{r}{2}+1} \gg \log n$, and consider $m = \log n$
copies $G_1, \dots, G_m$ of $G_{n,p}$. We claim that their union percolates with high probability. To
see this, first observe that $G_{n,p}$ contains an $r$-clique $R_1$ with high probability. Next, note
that the expected number of copies of $K_r$ plus a pendant edge, with its endpoint in $R_1$, is at
least $|R_1| \binom{n - |R_1|}{r} p^{\binom{r}{2}+1} \gg \log n$. Using Chebychev, it follows that there exist at least $\log n$
such copies with high probability, and the closure of these is a clique $R_2$ on at least $\log n$
vertices. Now, simply repeat this procedure for each graph $G_3, \dots, G_m$. A straightforward
calculation shows that, with high probability, at each step the clique $R_j$ (at least) doubles in
size, until it reaches size $1/p$. But now a positive fraction of the vertices have $r$ neighbours
in $R_{m-2}$, so $|R_{m-1}| \ge \varepsilon n$, and thus $|R_m| = n$ with high probability, as required.

To prove the lower bound, we define a process analogous to the Clique Process in Section 4.
To be precise, we can break up the process into steps of the following two types: (a) if two
($\ge 2r$)-cliques share two vertices then merge them, and (b) if an edge is infected then consider
the copy of $H$ it completes, and merge the ($\ge 2r$)-cliques which provided the edges of $H - e$.
To see that this works, recall that $\langle DD_r - e\rangle_{DD_r} = K_{2r}$.

Using this process, we can easily prove a result analogous to Lemma 17, except with 3
replaced by $e(H)$. Indeed, at each step the size of the largest clique increases by at most a
factor of $e(H)$. Moreover, by considering the penultimate step of the process, as in Lemma 18,
and using induction, we can prove the following extremal result: If $\langle G\rangle_{DD_r} = K_n$ and $n \ge r$,
then



The result now follows by a straightforward (and standard) calculation (using Markov), as
in the proof of Proposition 19. □

We now turn to some open problems. We begin by asking for sharper versions of Theorems 1
and 2.

Problem 1. Determine $p_c(n, K_r)$ up to a constant factor.

Problem 2. Find $1/4 \le \alpha \le 5$, if it exists, such that

\[ p_c(n, K_4) = \big(1 + o(1)\big) \frac{\alpha}{\sqrt{n \log n}}. \]

A natural family of graphs for which we do not have good bounds on the critical probability
is the complete bipartite graphs.

Problem 3. Determine $p_c(n, K_{s,t})$, at least up to a poly-logarithmic factor, for all $s, t \in \mathbb{N}$.

Another is the random graph, $G_{k,1/2}$.

Problem 4. Give bounds on $p_c(n, G_{k,1/2})$ which hold with high probability as $k \to \infty$.

Finally, we mention a substantial generalization of the problem we have considered in this
paper. Given graphs $G$ and $H$, define $H$-bootstrap percolation on $G$ by only allowing edges
of $G$ to be infected, and say that a graph $F$ percolates if, starting with $F$, eventually all
edges of $G$ are infected. It seems likely that there are many beautiful theorems to discover
about this very general bootstrap process.